Abstract - We present in this article a new evaluation method for the classification and segmentation of textured images in uncertain environments. In uncertain environments, the real classes and boundaries are known with only a partial certainty given by the experts. In most published work, only the classification or only the segmentation is considered and evaluated. Here, we propose to take into account both the classification and segmentation results according to the certainty given by the experts. We present the results of this method on a fusion of classifiers of sonar images for seabed characterization.

Keywords: Image classification, Image segmentation, Uncertain environment, Sonar image, Fusion of experts, Fusion of classifiers.

1 Introduction 

Textured image classification is a difficult problem in image processing and it is fundamental for many applications. Many features can be extracted from the images to classify, and many classification algorithms can be used [1]. Hence, it is necessary to evaluate their performance in order to compare them and choose the one best suited to the application.

For instance, with satellite or sonar images, hu- 
man experts must be able to classify the types of soils 
present in the images. Many types of soils can be en- 
countered in a single image, and classification must be 
done on a local part of the image (pixel-wise, or often 
on small tiles of e.g. 16 x 16 or 32 x 32 pixels) taken 
as unit for the classification algorithm. Hence, after 
the image classification, an implicit image segmenta- 
tion is obtained according to the size of the tiles. One 
image will be segmented into several patches, each one 
corresponding to a class (e.g. a specific type of soil).

Image classification methods are currently evaluated with the confusion matrix. Good-classification rates and error rates are usually calculated from this matrix. We must know the real class of the considered units of the images in order to establish the confusion matrix. However, the confusion matrix does not give an evaluation of the produced segmentation.

In order to evaluate the segmentation, we cannot only consider a visual comparison between the initial image and the segmented image. Image segmentation evaluation is still an actively studied problem [2, 3, 4, 5]. We can consider two cases: either we do not have any a priori knowledge of the correct segmentation, or we do. Here we are in the second case, because the confusion matrix already requires referenced images. In order to obtain these referenced images, experts must manually provide the image segmentation, for example via a visual inspection. Zhang in [2] gives a review of usual discrepancy measures based on different distances between the segmented pixel and the referenced pixel. Most of the time, only one measure of mis-segmented pixels is given. On the contrary, we propose in this article a linked study of a well-segmented pixel measure and a mis-segmented pixel measure. Indeed, in the general case, a pixel that is not mis-segmented is not necessarily well-segmented. So we can have few mis-segmented pixels but also few well-segmented pixels, in which case the segmentation is not good.

We think that a global image classification evaluation must evaluate both the classification on the considered units (with the confusion matrix) and, at the same time, the produced segmentation (with a well-segmented pixel measure and a mis-segmented pixel measure) [6].

In real applications, it is really hard for a human expert to provide certain information on the class and on the boundaries between the classes. For instance, the seabed characterization with sonar images cannot be made by a human expert with sufficient certainty. These images, which illustrate this paper, are obtained with many imperfections [7]. Figure 1 exhibits the differences between the interpretation and the certainty of two sonar experts trying to differentiate the type of sediment (rock, cobbles, sand, ripple, silt) or shadow when the information is invisible (each color corresponds to a kind of sediment and to the associated certainty of the expert for this sediment, expressed in terms of sure, moderately sure and not sure).

Figure 1: Segmentation given by two experts.

We propose here a new approach for the evaluation of textured image classification and segmentation, taking into account the information given by multiple experts and their certainty. In section 2 we show how to integrate the expert certainty in the confusion matrix, then deduce a good-classification rate and an error classification rate, and how to fuse the different expert opinions. In section 3 we propose two new distance-based measures in order to evaluate well-segmented and mis-segmented pixels, taking into account the certainties of the experts. This evaluation is illustrated in section 4 on real sonar images, in order to evaluate a fusion of the classifiers presented in [7].
The normalized confusion matrix is defined by:

$$Ncm_{ij} = \frac{cm_{ij}}{\sum_{j=1}^{N} cm_{ij}} = \frac{cm_{ij}}{N_i}, \qquad (1)$$

with $N$ the number of considered classes and $N_i$ the number of elements from the true class $i$. From this normalized confusion matrix a good-classification rate vector can be written as:

$$GCR_i = Ncm_{ii}, \qquad (2)$$

and an error classification rate vector as:

$$ECR_i = \frac{1}{2}\left(\sum_{j=1,\, j\neq i}^{N} Ncm_{ij} + \frac{1}{N-1}\sum_{j=1,\, j\neq i}^{N} Ncm_{ji}\right). \qquad (3)$$

This error classification rate is the mean of two errors: the one corresponding to the elements from a given class $i$ classified in another class (first term), and the one corresponding to the elements classified in the class $i$ while coming from another class $j$ (second term). We do not have to normalize the first term because of the normalization of the confusion matrix on the rows, but the second term must be normalized by the number of rows minus one (because the $Ncm_{ii}$ term corresponds to the good classification).

The evaluation of image classification algorithms must be made not only on one image but on the whole image database. As a consequence, we have to consider a non-normalized confusion matrix on each image and normalize the sum of these confusion matrices over all images of the database.
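As an illustration, the row normalization and the two rate vectors described above can be sketched in Python as follows. This is only a minimal sketch: the function names, the list-of-lists representation and the handling of an empty class are assumptions made for the example, and the error rate follows the verbal description (first term not normalized, second term divided by the number of classes minus one).

```python
def sum_matrices(matrices):
    """Add the non-normalized confusion matrices of several images."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) for j in range(n)] for i in range(n)]


def normalize_rows(cm):
    """Divide each row i by N_i, the number of elements of true class i."""
    normalized = []
    for row in cm:
        n_i = sum(row)
        # Assumption: a class without any element gets a row of zeros.
        normalized.append([value / n_i if n_i else 0.0 for value in row])
    return normalized


def good_classification_rates(ncm):
    """Diagonal of the normalized confusion matrix."""
    return [ncm[i][i] for i in range(len(ncm))]


def error_classification_rates(ncm):
    """Mean of the two errors for each class."""
    n = len(ncm)
    rates = []
    for i in range(n):
        # Elements of class i classified in another class.
        first = sum(ncm[i][j] for j in range(n) if j != i)
        # Elements of other classes classified in class i, divided by n - 1.
        second = sum(ncm[j][i] for j in range(n) if j != i) / (n - 1)
        rates.append((first + second) / 2)
    return rates


# Two hypothetical images with three classes: the matrices are summed
# before the normalization, as required for a whole database.
image_1 = [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
image_2 = [[1, 0, 1], [0, 2, 0], [0, 1, 1]]
ncm = normalize_rows(sum_matrices([image_1, image_2]))
gcr = good_classification_rates(ncm)
ecr = error_classification_rates(ncm)
```

Summing before normalizing means that large images weigh more than small ones, which is the behaviour described for the database-level evaluation.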



2 Image classification evaluation 

In this section, we propose an original evaluation approach for classification based on a new confusion matrix taking into account the uncertainty and the possibility that one unit belongs to more than one class. This evaluation approach is adapted to image classification evaluation, but can be used for any classifier evaluation.



2.1 Classical Evaluation 

A first step of the classical classification evaluation can be made by comparing the results of the classifier to the reality. But in order to evaluate a classification algorithm, many different configurations and tests must be considered, because classification algorithms can yield very variable results depending on the sample. Most of the time, the evaluation of classification algorithms is conducted with the confusion matrix.

The confusion matrix is composed of the numbers $cm_{ij}$ of elements of the class $i$ classified in the class $j$. In order to obtain rates that make it easier to compare databases of different sizes, we normalize this confusion matrix by rows (equation (1)).


2.2 Evaluation with certainty given by each expert

We consider here a general case where the information is given by the expert on each pixel and the classification algorithm works on a unit of n x n pixels. Hence more than one class can be present on each unit, while the classification algorithm generally finds only one of these classes. In order to take into account the inhomogeneous units, we consider that if the classification algorithm finds one of these classes on the unit, the algorithm is right in the proportion of this found class in the n x n pixel unit, and wrong in the proportion of the other classes in the considered unit. For instance, imagine the case where the expert considers a tile of size 16 x 16 pixels and declares that 50 pixels of the unit belong to class 1 and the 206 other pixels belong to class 3. If the classification algorithm finds that the unit belongs to class 1, the confusion matrix is updated by the recurrence relations $cm_{11} \leftarrow cm_{11} + 50/256$ and $cm_{31} \leftarrow cm_{31} + 206/256$. Hence the confusion matrix is not composed of integer numbers and $N_i$ is also not an integer, but the sums of the columns are still integers.
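The update for an inhomogeneous unit can be sketched as follows. This is a minimal illustration only: the function name `add_tile`, the 0-based class indices and the flat list of pixel labels are assumptions made for the example.

```python
from collections import Counter


def add_tile(cm, pixel_labels, found_class):
    """Update cm in place with one tile.

    pixel_labels: expert class index of every pixel of the tile (0-based).
    found_class: class index returned by the classifier for the whole tile.
    """
    n_pixels = len(pixel_labels)
    for true_class, count in Counter(pixel_labels).items():
        # The tile counts for each true class in proportion to its pixels.
        cm[true_class][found_class] += count / n_pixels
    return cm


# Example of the text: a 16 x 16 tile with 50 pixels of class 1 and
# 206 pixels of class 3, classified as class 1 (indices 0 and 2 here).
cm = [[0.0] * 3 for _ in range(3)]
add_tile(cm, [0] * 50 + [2] * 206, found_class=0)
```

Each tile adds a total of 1 to the column of the found class, which is why the column sums stay integers while the row sums $N_i$ do not.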

If the expert can give the class with a certainty grade, we must not treat two different grades equally in our classification evaluation. For instance, in a sonar application, the operator can be sure that one part of the image belongs to rock, and be totally doubtful about another part of the image. Classical confusion matrices suppose that the reality is perfectly known, which is rarely the case, especially in image classification. We propose to grade this difference of information with different weights corresponding to the different grades of certainty that are considered. In the confusion matrix, such weights can be integrated easily in the general sum. For example, with three grades of certainty (sure, moderately sure and not sure), we can choose respectively the weights 2/3, 1/2 and 1/3. If one expert labels a unit as belonging to class 1 (e.g. rock) with a moderate certainty, and if the classification algorithm finds class 1, then with the previous weights the confusion matrix is updated as $cm_{11} \leftarrow cm_{11} + 1/2$. If the classification algorithm finds class 2 (e.g. sand) on the considered unit, the confusion matrix becomes $cm_{12} \leftarrow cm_{12} + 1/2$. Hence the sums of the columns are not integers anymore.
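A minimal sketch of this weighted update is given here. The dictionary `CERTAINTY_WEIGHTS` uses the weights of the example, while the function name `add_unit` and the 0-based class indices are assumptions made for the illustration.

```python
# Weights of the example in the text; any other grading is possible.
CERTAINTY_WEIGHTS = {"sure": 2 / 3, "moderately sure": 1 / 2, "not sure": 1 / 3}


def add_unit(cm, expert_class, certainty, found_class, weights=CERTAINTY_WEIGHTS):
    """Update cm in place with one unit labeled with a certainty grade."""
    cm[expert_class][found_class] += weights[certainty]
    return cm


cm = [[0.0, 0.0], [0.0, 0.0]]
# Expert: class 1 (index 0) with a moderate certainty, classifier: class 1.
add_unit(cm, 0, "moderately sure", 0)
# Same expert label on another unit, but the classifier finds class 2 (index 1).
add_unit(cm, 0, "moderately sure", 1)
```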

In order to fuse the referenced images provided by different experts, we can compare the classified image with all the images referenced by the experts. Hence we obtain as many non-normalized confusion matrices as experts, and we can simply combine them by addition. This can also be done if the experts do not provide any certainty: in that case the weight is 1 for all units.

By the simple addition of the non-normalized confusion matrices, we weight the obtained results by the image size or by the number of considered units.
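The fusion of the expert references by addition can be sketched as follows, under the assumption that each expert gives one class and one certainty grade per unit (a missing grade, written `None` here, is given the weight 1). The names `expert_matrix` and `fuse_by_addition` are hypothetical.

```python
WEIGHTS = {"sure": 2 / 3, "moderately sure": 1 / 2, "not sure": 1 / 3}


def expert_matrix(found, reference, n_classes, weights=WEIGHTS):
    """Non-normalized confusion matrix of the classifier against one expert.

    found: class index found by the classifier for each unit.
    reference: (class index, certainty grade or None) given by the expert.
    """
    cm = [[0.0] * n_classes for _ in range(n_classes)]
    for found_class, (expert_class, certainty) in zip(found, reference):
        weight = 1.0 if certainty is None else weights[certainty]
        cm[expert_class][found_class] += weight
    return cm


def fuse_by_addition(matrices):
    """Element-wise sum of the matrices obtained with each expert."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) for j in range(n)] for i in range(n)]


found = [0, 1, 1]
expert_a = [(0, "sure"), (1, "not sure"), (0, "moderately sure")]
expert_b = [(0, None), (1, None), (1, None)]  # no certainty provided
fused = fuse_by_addition(
    [expert_matrix(found, expert_a, 2), expert_matrix(found, expert_b, 2)]
)
```

The fused matrix is still non-normalized, so the row normalization is applied only once, after the addition.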

In order to obtain rates, we normalize the obtained confusion matrix with equation (1) and calculate the good-classification rate vector with equation (2) and the error classification rate vector with equation (3). Of course these rates are not percentages anymore. For instance, the good-classification rate is no longer the percentage of well classified units, because the weights given by the inhomogeneous units or by the expert certainty are rational. The newly obtained confusion matrix, good-classification rate and error classification rate give a good evaluation of the classification, taking into account the inhomogeneous units and the certainty of the experts. This approach can be applied in every domain where we try to classify uncertain elements, and not only in image classification.

3 Segmentation Evaluation 

Image classification provides an implicit image segmentation, in which the boundaries are given by the difference of classes between two adjacent tiles. A good image classification evaluation has to study this obtained image segmentation.

Many approaches can be considered in order to ob- 
tain boundaries. This is not the subject of this paper 
and the following segmentation evaluation can be ap- 
plied to all image segmentations given by boundaries 
as a succession of pixels. 



We propose here a linked study of one well- 
segmented pixel measure and a mis-segmented pixel 
measure. Generally one of these measures is consid- 
ered in the case with an a priori knowledge [2 El E] • 
The well-segmented pixel measure is a well-detection 
boundary measure and the mis-segmented pixel mea- 
sure is a false detection boundary measure. We show 
how these two measures can take into account the un- 
certainty of the expert on the position and existence of 
the boundaries, assuming that each certainty grade is 
represented by a weight. 

3.1 Well-detection boundary measure 

First, for each found boundary pixel $f$, we search the minimal distance $d_{fe}$ between $f$ and all the boundary pixels $e$ provided by the expert. Hence the pixel $e$ is a function of $f$, and we should note it $e_f$, but in order to simplify notations it is referred to as $e$ in the rest of the paper. We take here the Euclidean distance, but any other distance can be envisaged. The certainty weight of the pixel $e$ given by the expert is noted $W_e$. We define a well-detection criterion vector by:

$$DC_f = \exp\left(-(d_{fe} \cdot W_e)^2\right) \cdot W_e. \qquad (4)$$

This criterion gives a Gaussian-like distribution of weights with a standard deviation given by the certainty weights, as shown in figure 2.
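As an illustration, the criterion can be computed with a brute-force nearest-pixel search. This sketch assumes that equation (4) reads DC_f = exp(-(d_fe . W_e)^2) . W_e; the function name and the data layout (pixel coordinates as tuples, reference pixels paired with their weight) are assumptions made for the example.

```python
import math


def well_detection_criteria(found, reference):
    """Return, for each found boundary pixel f, the pair (DC_f, index of e).

    found: list of (row, col) pixels of the found boundary.
    reference: list of ((row, col), W_e) pixels of the expert boundary.
    """
    results = []
    for f in found:
        nearest, d_fe = None, math.inf
        for index, (e, _) in enumerate(reference):
            distance = math.dist(f, e)  # Euclidean distance
            if distance < d_fe:
                nearest, d_fe = index, distance
        w_e = reference[nearest][1]
        dc_f = math.exp(-(d_fe * w_e) ** 2) * w_e
        results.append((dc_f, nearest))
    return results


reference = [((0, 0), 2 / 3), ((2, 3), 1 / 2)]  # "sure" and "moderately sure"
found = [(0, 0), (0, 3)]
criteria = well_detection_criteria(found, reference)
```

A found pixel lying exactly on a reference pixel reaches the maximum value $W_e$, and the value decreases faster with the distance when the expert is more certain.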




Figure 2: Distance weight for the well-detection criterion.

The well-detection boundary measure is defined by the normalized well-detection criterion given by:

Hence, this measure is defined between 0 and 1. In real applications, this criterion remains small even for very good boundary detection, so we can take a = 1/6 in order to accentuate small values.

This criterion only takes into account the distance from the found boundary to the contour provided by the expert. However, the reference boundary has a local direction, which is another aspect we have to consider. For instance, a found boundary can cross a given boundary orthogonally: in this case some pixels from the found boundary are very near (in terms of distance) to pixels from the reference boundary, but that is not a good detection.



In order to take into account the local direction, we count, for a given pixel $f$ of the found boundary, how many pixels from the found boundary are linked by the minimal distance to the same pixel $e$ of the reference boundary. This number is noted $n_{ef}$; e.g. on figure 3 we have $n_{ef} = 3$ for three different $f$. We redefine the well-detection boundary measure by:



$$WDC = \sum_f DC_f / n_{ef} \qquad (6)$$




Figure 3: Example of $n_{ef}$ for three given $f$; the found boundary is represented by green squares and the referenced boundary by a black line.

The problem is that the number $n_{ef}$ does not adequately represent a number of pixels on the same boundary and takes into account only the orthogonal direction. However, this measure gives a good evaluation of the proportion of the found boundaries.
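The count $n_{ef}$ and its use in the measure can be sketched as follows. Two points are assumptions of this sketch and not statements of the method: the sum is normalized by the sum of the reference weights, and the parameter a is applied as an exponent. The function name and the input format (pairs of DC_f and index of the linked reference pixel) are also hypothetical.

```python
from collections import Counter


def well_detection_measure(criteria, reference_weights, a=1 / 6):
    """Well-detection measure from (DC_f, index of e) pairs.

    Assumptions: normalization by the sum of the reference weights,
    and a used as an exponent to accentuate small values.
    """
    # n_ef: number of found pixels linked to the same reference pixel e.
    n_ef = Counter(index for _, index in criteria)
    total = sum(dc_f / n_ef[index] for dc_f, index in criteria)
    return (total / sum(reference_weights)) ** a


# Three found pixels linked to reference pixel 0, one linked to pixel 1.
criteria = [(0.5, 0), (0.5, 0), (0.5, 0), (0.25, 1)]
reference_weights = [0.5, 0.5]
wdc_raw = well_detection_measure(criteria, reference_weights, a=1)
wdc = well_detection_measure(criteria, reference_weights)
```

With this normalization, several found pixels linked to the same reference pixel cannot contribute more than a single perfectly placed pixel, so the value stays between 0 and 1.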

3.2 False detection boundary measure 

The false detection boundary measure is based on the same principle as the well-detection boundary measure, but the Gaussian-like distribution of weights must be inverted. Hence we can define a false detection criterion by:

$$FDC_f = 1 - DC_f / W_e, \qquad (7)$$

where the pixels $f$ and $e$ are linked by the minimal distance $d_{fe}$. As a consequence, the false detection boundary measure can be defined from the normalized false detection criterion by:



$$FD = 1 - \exp\left(-\sum_f (FDC_f \cdot n_{ef})\right)$$
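The false detection criterion of equation (7) can be illustrated for a single found pixel. The sketch assumes the same reading of the well-detection criterion as before, DC_f = exp(-(d_fe . W_e)^2) . W_e, so that FDC_f does not depend on the amplitude of the weight but only on the width of the Gaussian; the function name is hypothetical.

```python
import math


def false_detection_criterion(d_fe, w_e):
    """FDC_f = 1 - DC_f / W_e for one found pixel f linked to e."""
    dc_f = math.exp(-(d_fe * w_e) ** 2) * w_e
    return 1.0 - dc_f / w_e


on_boundary = false_detection_criterion(0.0, 2 / 3)   # exactly on the reference
near = false_detection_criterion(2.0, 1 / 2)          # two pixels away
far = false_detection_criterion(50.0, 1 / 2)          # far from any reference
```

The criterion is 0 for a found pixel lying on the reference boundary and tends to 1 when the pixel is far from every reference pixel.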
Here we have described the two measures FD and WDC, which compare two images: one image classified by the algorithm and the other one provided by only one expert. In order to evaluate image segmentation algorithms on many images and/or to fuse the expert opinions, we can use a weighted sum of both measures. The weights are given by the image sizes, which can be different for all considered images.
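This combination can be sketched as a sum weighted by the image sizes. Dividing by the total size, so that the result stays in the same range as the individual measures, is an assumption of this sketch, as is the function name.

```python
def combine_measures(values, image_sizes):
    """Sum of per-image (or per-expert) measures weighted by the image sizes."""
    total_size = sum(image_sizes)
    return sum(v * size for v, size in zip(values, image_sizes)) / total_size


# Hypothetical WDC values on two images of 100 and 300 tiles.
combined = combine_measures([0.6, 0.4], [100, 300])
```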

4 Fusion of classifiers of sonar images

We present here our image classification and segmentation evaluation on the fusion of classifiers of sonar images presented in [7]. Indeed, the underwater environment is a very uncertain environment and it is particularly important to classify the seabed for numerous applications such as Autonomous Underwater Vehicle navigation. In recent sonar works (e.g. [10, 11]), the classification evaluation is made only by visual comparison of the original image and the classified image. That is not satisfactory for a correct evaluation of image classification and segmentation.

4.1 Database 

Our database contains 42 sonar images provided by the GESMA (Groupe d'Etudes Sous-Marines de l'Atlantique). These images were obtained with a Klein 5400 lateral sonar with a resolution of 20 to 30 cm in azimuth and 3 cm in range. The sea-bottom depth was between 15 m and 40 m.

Three experts have manually segmented these images, giving the kind of sediment (rock, cobble, sand, silt, ripple (horizontal, vertical or at 45 degrees)), shadow or other (typically ships) parts on the images, helped by the manual segmentation interface presented in figure 4. All sediments are given with a certainty level (sure, moderately sure or not sure), and the boundary between two sediments is also given with a certainty (sure, moderately sure or not sure). Hence, every pixel of every image is labeled as being either a certain type of sediment, a shadow or other, or a boundary, with one of the three certainty levels. We choose the weights 2/3, 1/2 and 1/3 for the certainty levels sure, moderately sure and not sure, respectively. The proportions of each sediment given by the three experts are given in table 1. Note that the proportions of the different sediments are very different, which can be a problem for the classification. The proportions are very similar for the three experts. We see that sand and silt are the most present, and that shadow and other are very rarely represented on these images.
Figure 4: Manual Segmentation Interface.

Table 1: Proportion of sediment in the database (%)

          Expert 1   Expert 2   Expert 3
Rock          9.64       9.62      12.78
Cobble        6.00       3.71       8.42
Ripple       13.96      15.98      13.53
Sand         26.97      35.62      28.40
Silt         42.85      34.57      35.20
Shadow        0.55       0.44       0.26
Other         0.10       0.05       1.40

4.2 Fusion approaches

For this evaluation on a fusion of classifiers of sonar images, we consider four methods of feature extraction based on four representations of the image: co-occurrence matrices, run-length matrices, the wavelet transform and Gabor filters [7]. They provide respectively 24, 20, 63 and 4 parameters. These four feature sets are independently considered as the inputs of a multilayer perceptron (MLP) classifier presented in [1]. In order to illustrate our evaluation approach, only two of the classifier fusion approaches presented in [7], both coming from the evidence theory, are considered.

The evidence theory is based on basic belief assignments (bba) defined by a mapping of each subset of the space of discernment $\Theta = \{C_1, \ldots, C_n\}$ onto [0,1], such that:

$$\sum_{X \in 2^\Theta} m(X) = 1, \qquad (9)$$

where m(.) represents the bba.
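For illustration, a bba can be represented in Python as a dictionary from subsets of the space of discernment to masses, the subsets that are not listed having a mass of 0. This representation and the function name `is_bba` are assumptions made for the example.

```python
import math


def is_bba(masses, frame):
    """Check that masses (dict: frozenset -> mass) is a bba on the frame."""
    frame = frozenset(frame)
    subsets_ok = all(focal <= frame for focal in masses)
    range_ok = all(0.0 <= value <= 1.0 for value in masses.values())
    return subsets_ok and range_ok and math.isclose(sum(masses.values()), 1.0)


frame = {"rock", "sand", "silt"}
m = {
    frozenset({"rock"}): 0.6,
    frozenset({"rock", "sand"}): 0.3,  # hesitation between two classes
    frozenset(frame): 0.1,             # mass left on the whole frame (ignorance)
}
```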

The principal difficulty is the choice of a bba according to the application. We can consider two types of approaches: one based on a probabilistic model [12] and another one based on a distance transformation [13]. Appriou in [12] proposes two equivalent models based on three axioms. The first one, which we use in this article in order to fuse the decisions of the four classifiers, is given by:

$$\begin{cases} m_{ji}(C_i)(x) = \dfrac{\alpha_{ij} R_j\, p(q_j|C_i)}{1 + R_j\, p(q_j|C_i)} \\ m_{ji}(C_i^c)(x) = \dfrac{\alpha_{ij}}{1 + R_j\, p(q_j|C_i)} \\ m_{ji}(\Theta)(x) = 1 - \alpha_{ij} \end{cases} \qquad (10)$$

where $q_j$ is the $j$th classifier (supposed cognitively independent), $j = 1, \ldots, m$, $\alpha_{ij}$ are reliability coefficients on each classifier $j$ for each class $i = 1, \ldots, n$ (in our application we take $\alpha_{ij} = 1$), and $R_j = (\max_{q_j, i} p(q_j|C_i))^{-1}$.
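A minimal sketch of this probabilistic model is given here, assuming that the first model of [12] has its usual three-mass form: a mass on the class $C_i$, a mass on its complement and a mass 1 - alpha on the whole space. The function name, the dictionary keys and the likelihood values are hypothetical, and the maximum defining $R_j$ is taken over the classes only, which is a simplification.

```python
def probabilistic_bba(likelihood, r_j, alpha=1.0):
    """Masses given by one classifier q_j for one class C_i.

    likelihood: p(q_j | C_i); r_j: normalization factor R_j;
    alpha: reliability coefficient (1 in the application of the text).
    """
    denominator = 1.0 + r_j * likelihood
    return {
        "C_i": alpha * r_j * likelihood / denominator,
        "not C_i": alpha / denominator,
        "Theta": 1.0 - alpha,
    }


likelihoods = [0.5, 0.25, 0.125]   # hypothetical p(q_j | C_i) for i = 1, 2, 3
r_j = 1.0 / max(likelihoods)       # simplification: maximum over the classes only
bbas = [probabilistic_bba(p, r_j) for p in likelihoods]
```

With alpha = 1 no mass is left on the whole space, and the class with the highest likelihood receives at most half of the mass.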

The approach proposed in [13] is used in order to fuse the numerical outputs of the four classifiers. The bba are given by:

$$\begin{cases} m_j(\{C_i\}/x^{(t)})(x) = \alpha_{ij}\, \varphi_i(d^{(t)}) \\ m_j(\Theta/x^{(t)})(x) = 1 - \alpha_{ij}\, \varphi_i(d^{(t)}) \end{cases} \qquad (11)$$

where $(x^{(t)})$ is a set of learning vectors, $d^{(t)} = d(x, x^{(t)})$ is a distance between $x$ and $x^{(t)}$, $C_i$ is the class of $x^{(t)}$, and $\varphi_i$ is a distance function given by:

$$\varphi_i(d) = \exp(-\gamma_i d^2), \qquad (12)$$

where $\gamma_i$ is a positive parameter associated with the class $C_i$.
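The distance-based bba can be sketched in the same way for one learning vector. This assumes that the distance function is the decreasing exponential exp(-gamma_i d^2) and that the distance is Euclidean; the function name, the dictionary keys and the numerical values are hypothetical.

```python
import math


def distance_bba(x, learning_vector, gamma, alpha=1.0):
    """Masses induced on x by one learning vector of class C_i."""
    d = math.dist(x, learning_vector)  # Euclidean distance (assumption)
    support = alpha * math.exp(-gamma * d ** 2)
    return {"C_i": support, "Theta": 1.0 - support}


same = distance_bba((0.0, 0.0), (0.0, 0.0), gamma=0.04)
close = distance_bba((3.0, 4.0), (0.0, 0.0), gamma=0.04)    # d = 5
far = distance_bba((300.0, 400.0), (0.0, 0.0), gamma=0.04)  # d = 500
```

A learning vector far from x says almost nothing about the class of x: nearly all the mass goes to the whole space.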

The combination of the bba is based on the non-normalized orthogonal Dempster-Shafer rule, given in [14] for all $X \in 2^\Theta$ by:

$$m(X) = \sum_{Y_1 \cap \ldots \cap Y_M = X} \prod_{j=1}^{M} m_j(Y_j), \qquad (13)$$

where $Y_j \in 2^\Theta$ is the response of the expert $j$, and $m_j(Y_j)$ the associated belief function. In order to make the decision, we consider the maximum of the pignistic probability [15].
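The combination and the decision can be sketched as follows, with bba stored as dictionaries from `frozenset` to mass. The function names are hypothetical, and the pignistic probability is written in its usual form, where the mass of each subset is shared equally between its classes after discounting the mass of the empty set (the sketch assumes a conflict strictly below 1).

```python
from itertools import product


def conjunctive_combination(bbas):
    """Non-normalized conjunctive rule for a list of bba."""
    combined = dict(bbas[0])
    for bba in bbas[1:]:
        result = {}
        for (y1, m1), (y2, m2) in product(combined.items(), bba.items()):
            focal = y1 & y2  # may be the empty set: the conflict is kept
            result[focal] = result.get(focal, 0.0) + m1 * m2
        combined = result
    return combined


def pignistic(bba):
    """Pignistic probability of each class."""
    conflict = bba.get(frozenset(), 0.0)
    bet = {}
    for focal, mass in bba.items():
        for c in focal:  # the empty set contributes nothing
            bet[c] = bet.get(c, 0.0) + mass / (len(focal) * (1.0 - conflict))
    return bet


m1 = {frozenset({"sand"}): 0.6, frozenset({"sand", "silt"}): 0.4}
m2 = {frozenset({"silt"}): 0.5, frozenset({"sand", "silt"}): 0.5}
m = conjunctive_combination([m1, m2])
bet = pignistic(m)
decision = max(bet, key=bet.get)
```

In this example the mass of the empty set, `m[frozenset()]`, is the conflict between the two sources, which the non-normalized rule keeps instead of redistributing.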

4.3 Evaluation 

Here, we consider six different classes given by table 2. The images are considered as a succession of tiles of size 32x32 pixels. Hence the 42 images provide 38997 tiles, which are the units for the classification. The proportion of the number of different sediments on a tile is given in table 3 for each expert. These proportions are very similar for the three experts.

Table 2: Distribution of the kinds of sediment in classes

class     sediment
class 1   rock
class 2   cobble
class 3   ripple
class 4   sand
class 5   silt
class 6   shadow and other



Table 3: Proportion of the number of different kinds of sediment on the tiles (%)

               Expert 1   Expert 2   Expert 3
1 sediment        77.79      79.65      79.94
2 sediments       20.70      19.30      19.33
3 sediments        1.48       1.03       0.72
4 sediments        0.04
5 sediments
6 sediments

The total conflict between the three experts is 0.2244. This conflict comes essentially from the difference of opinion of the experts and not from the tiles with more than one sediment. Indeed, we have a weak auto-conflict (conflict coming from the combination of the same expert three times). The values of the auto-conflict for the three experts are 0.0496, 0.0474 and 0.0414.

The database is divided into three parts. The first part, composed of 20 images (with only 12505 tiles), is used for the learning step of the multilayer perceptron. A second part of 10 images (composed of 12650 tiles) serves the learning step of both fusion approaches. The information used for these learning stages is given by only one of the three experts (expert 1). The last 12 images (corresponding to 13841 tiles) are used in order to evaluate the classifier fusion methods, considering the information given by the two other experts.

Figure 5 shows the manual segmentation made by one expert and the automatic classification reached by both classifier fusion methods. The dark blue part corresponds to the part of the image that is not considered. First of all, if we look at the classification results of the same image on figure 5, we note that the sediments are quite well classified. However, just by looking at figure 5 we cannot say whether the classification is good or not, and whether one fusion approach is better than the other: it remains very subjective. Moreover it could be good for this image and not for others. So we propose to use our measures.



Figure 5: Manual segmentation (first) and automatic segmentation given by the probabilistic approach (second) and the distance approach (third).

First we compare the obtained results to the information given by expert 2 only. The normalized confusion matrix obtained on the test database is given by, for the probabilistic approach:
We note that the distance approach does not classify rock and other. Most of the tiles are classified in ripple and silt, and few in sand. The probabilistic approach provides a full confusion matrix. In order to summarize these results, we can give the vector of good-classification rates and the vector of error classification rates, given by [0 24.62 47.70 59.51 82.10 30.97] and [94.30 59.13 54.55 82.84 71.18 148.05] for the probabilistic approach and by [0 32.05 51.51 28.94 95.44 0] and [50.00 64.03 144.84 72.88 149.43 50.00] for the distance approach. We recall that these rates are not percentages because of the weights. The vector of good-classification rates can provide a mean good-classification rate. We obtain here 62.43 for the probabilistic approach and 50.55 for the distance approach. These results tend to show that the probabilistic approach gives better results than the distance approach. We can also study the difference between homogeneous tiles and inhomogeneous tiles. For instance, for the probabilistic-based approach, the normalized confusion matrix on homogeneous tiles is given by:
and on inhomogeneous tiles (matrix only partially recoverable, entries in reading order, "..." marks missing values):

rock ... sand silt other
25.50 11.12 7.41 15.09 14.14 26.73
20.97 17.84 8.32 34.02 8.15 10.71
13.50 5.71 ... 13.02 11.21 5.79 11.86 44.95 21.33 4.85 13.79 3.24 10.24 16.40 53.55 2.77 39.46 ... 29.58 30.97



We observe an important difference. The good-classification rate is better on the homogeneous tiles (62.43) than on the inhomogeneous tiles (39.99). Hence the classification of the inhomogeneous tiles is a real difficulty.

Figure 5 seems to show that the segmentation of the distance approach is better than that of the probabilistic approach. We have to evaluate the segmentation produced by the classification with our measures. Note that this evaluation depends highly on the size of the tile, here 32x32 pixels. Our proposed measures, given respectively by equations (6) and (8) and expressed in percentages, provide for the probabilistic approach 59.84 for the well-detection criterion and 45.64 for the false alarm criterion, and for the distance approach 57.22 for the well-detection criterion and 48.54 for the false alarm criterion. Both the well-detection criterion and the false alarm criterion of the probabilistic-based fusion are better than those of the distance-based fusion. However, we have to be careful, because both measures must be studied together. Indeed, on figure 5 the probabilistic-based method provides a lot of boundaries, so the chance of obtaining a high well-detection criterion increases, but the false alarm increases also.

In order to confirm these results, we can easily fuse these measures with the measures obtained with expert 3. The good-classification rate and error classification rate vectors are respectively given by [26.73 14.54 39.83 60.56 81.83 0] and [61.71 57.01 58.47 109.05 111.89 74.16] for the probabilistic-based method and [0 17.94 48.02 30.09 95.83 0] and [50.00 63.31 70.10 87.40 169.54 50.00] for the distance-based method. The mean good-classification rate is 58.39 for the probabilistic-based method and 49.24 for the distance-based method. The results of the segmentation evaluation are given by the well-detection criterion and the false alarm criterion: respectively 62.76 and 54.57 for the probabilistic-based approach and 60.83 and 55.90 for the distance-based approach. The fusion of the measures coming from the two experts shows that the probabilistic-based method is better than the distance-based method. However the difference is smaller than with only one expert.

5 Conclusions 

We have proposed a new evaluation of image classification and segmentation based on new measures in uncertain environments. In order to achieve a good evaluation of the image classification, we have seen that a linked study of the classification and of the produced segmentation is necessary. The proposed classification evaluation can be used independently for every kind of classification of uncertain units, e.g. if a basic belief assignment is associated with the units. The proposed segmentation evaluation can be used for all image segmentation approaches and not only for a segmentation produced by a classifier. The proposed confusion matrix takes into account the uncertainty of the expert and also the inhomogeneous units (e.g. patch-worked images in the case of image classification). Moreover we have defined good-classification and error classification rates from our confusion matrix. The proposed segmentation evaluation considers good and false detection boundary measures where the subjectivity of the expert is considered through the given uncertainty.

In our proposed evaluation approach, the fusion of the opinions of the experts is made by the fusion of our different measures calculated for each expert. This fusion is made by using a simple sum: the uncertainty is considered directly in our measures. It could be interesting to fuse the information provided by the experts before the evaluation in order to obtain an uncertain and imprecise reality. This new reality could be used for instance for learning and also for the evaluation of classifiers.